\h[1][Cirodown]

A markup language to write multipage or single page HTML / PDF books, blogs, etc. that is saner and more powerful than Markdown and Asciidoctor, but still nicer to write than XML and JSON, with a reference implementation in JavaScript.

The source for this document is at: \a[https://github.com/cirosantilli/cirodown/blob/master/README.ciro] and a rendered version can be seen at: \a[https://cirosantilli.com/cirodown].

\toc

\h[2][Quick start]

Install the NPM package and use it from the command line for a quick conversion:

\C[[
npm install cirodown
printf 'ab\ncd\n' | cirodown --body-only
]]

Same with an API call in \a[api_hello.js]:

\C[[
npm install cirodown
./api_hello.js
]]

Play with it in a still crappy \x[editor-with-preview][live browser editor] (currently only works from a clone):

\C[[
git clone https://github.com/cirosantilli/cirodown-template
cd cirodown-template
npm install
./build
xdg-open editor.html
]]

For a minimal but convenient project template with a \c[.gitignore] and \c[package.json], consider:

\C[[
git clone https://github.com/cirosantilli/cirodown-template
cd cirodown-template
cirodown .
]]

Convert the master README.ciro and \x[includes][includes] into a single HTML page and view it:

\C[[
git clone https://github.com/cirosantilli/cirodown
cd cirodown
npm link
npm link cirodown
./cirodown --html-single-page README.ciro > index.html
xdg-open index.html
]]

or render instead as one separate HTML page per include:

\C[[
./cirodown .
xdg-open index.html
]]

The \c[npm link] commands allow you to make changes to the code without re-installing the package all the time for development! Try hacking \a[cirodown] to see it. Just remember that if you add a new dependency, you must redo the symlinking business:

\C[[
npm install <dependency>
npm link
npm link cirodown
]]

Asked if there is a better way at: \a[https://stackoverflow.com/questions/59389027/how-to-interactively-test-the-executable-of-an-npm-node-js-package-during-develo].

The symlink business can be undone with:

\C[[
npm unlink
rm node_modules/cirodown
]]

TODO: create a multifile blog example folder and link from here.

\h[2][Design goals]

Cirodown is designed entirely to allow writing complex professional HTML and PDF scientific books, blogs, articles and encyclopedias.

Cirodown aims to be the ultimate LaTeX "killer", allowing books to be finally published as either HTML or PDF painlessly (LaTeX being only a backend to PDF generation), supporting out of the box features such as:

\l[\x[internal-cross-references][references] to \x[headers][headers], \x[images][images], etc. with error checking: never break internal links again]
\l[KaTeX server side \x[math][Mathematics]]
\l[multi file features out of the box so you don't need a separate wrapper like Jekyll to make a multi-page website:
  \l[\x[internal-cross-file-references]]
  \l[\x[table-of-contents][table of contents] that crosses input files]
  \l[\c[inotifywait] watch rebuild server]
  \l[cross file includes that work as links on multi output file mode, and true includes in single output file mode]
  \l[cross file configuration files to factor out common page parts like headers and footers]
]

It is meant to be both saner and more powerful than Markdown and Asciidoctor.

The tradeoff for those features is that the language is slightly heavier to read and write.

It is intended that this will be an acceptable downside as Cirodown will be used primarily for large complex content such as books rather than forum posts, and will therefore primarily be written either:

\l[in text editors locally, where users have more features than in random browser textareas]
\l[in a dedicated website that will revolutionize education, and therefore have a good JavaScript editing interface: \a[https://github.com/cirosantilli/write-free-science-books-to-get-famous-website]]

\h[3][Saner]

To be saner than both Markdown and Asciidoctor, Cirodown has exactly five magic characters, with similar functions as in LaTeX:

\l[\c[\\] backslash to start a macro, like LaTeX]
\l[\c[\{] and \c[\}]: left and right curly braces to delimit optional macro arguments]
\l[\c[\[] and \c[\]]: left and right square brackets to delimit mandatory macro arguments]

And double blank newlines for \x[paragraphs][paragraphs] if you are pedantic.

We would like to have only square brackets for both optional and mandatory arguments to have even fewer magic characters, but that would make the language difficult to parse for computers and humans. LaTeX was right for once!

This produces a very regular syntax that is easy to learn, including doing:

\l[arbitrary nesting of elements]
\l[adding arbitrary properties to elements]

This sanity also makes the end tail learning curve of the endless edge cases found in Markdown and Asciidoctor disappear.

The language is designed to be philosophically isomorphic to HTML to:

\l[further reduce the learning curve]
\l[ensure that most HTML constructs can be reached, including arbitrary nesting]

More precisely:

\l[macro names map to tag names, e.g.: \c[\\a] to \c[<a]]
\l[
  one of the arguments of a macro maps to the content of the HTML element, and the others map to attributes.

  E.g., in a link:

  \C[[\a[http://example.com][Link text\]]]

  the first macro argument:

  \C[http://example.com]

  maps to the \c[href] of \c[<a], and the second macro argument:

  \C[Link text]

  maps to the internal content of \c[<a>Link text</a>]
]

\h[3][More powerful]

The \x[saner][high sanity of Cirodown] also makes creating new macro extensions extremely easy and intuitive.

All built-in language features use the exact same API as new extensions, which ensures that the extension API is sane forever.

Markdown is clearly missing many key features such as block attributes and \x[internal-cross-references], and has no standardized extension mechanism.

The "more powerful than Asciidoctor" part is only partially true, since Asciidoctor is very featureful and can do basically anything through extensions.

The difference is mostly that Cirodown is completely and entirely focused on making amazing scientific books, and so will have key features for that application out-of-the box, notably:

\l[amazing header/ToC/ID features including proper error reports: never have an internal broken link or duplicate ID again]
\l[\x[math]]
\l[\x[publish]: we take care of website publishing for you out-of-the-box, no need for extra stuff like Jekyll mess]

Another advantage over Asciidoctor is that the reference implementation of Cirodown is in JavaScript, and can therefore be used on browser live preview out of the box. Asciidoctor does transpile to JS with \a[https://github.com/opal/opal][Opal], but who wants to deal with that layer of complexity?

\h[2][Paragraphs \c[[\p]]]
{id=paragraphs}

OK, this is too common, so we opted for some insanity here: double newline is a paragraph!

Paragraph 1.

Paragraph 2.

Equivalently however, you can use an explicit \c[[\p]] macro as well, which is required for example to add properties to a paragraph, e.g.:

\p{id=paragraph-1}[Paragraph 1]
\p{id=paragraph-2}[Paragraph 2]

Paragraphs are created automatically inside \x[macro-argument-syntax][macro arguments] whenever a double newline appears.

Due to their insane syntax, paragraphs have to do some magic to determine where they start and end. TODO: mention the rule that paragraphs wrap only phrasing content, link to: \x[help-macros].

\h[2][Links \c[[\a]]]
{id=links}

Autolink (link text is the same as the link): \a[http://example.com].

Link with custom text: \a[http://example.com][my custom link text].

\a[http://example.com][Multiple

paragraphs]

\h[2][Internal cross references \c[[\x]]]
{id=internal-cross-references}

Every macro in Cirodown can have an optional \c[id] and a \c[title] property.

Those that have a \c[title] but no \c[id] get an auto-generated ID from the title: \x[automatic-id-from-title].

For macros that do have an ID, you can write a cross reference to it:

\h[3][x child argument]

TODO implement.

Setting \c[child=1] on a cross reference to a header as in:

\C[[
\xref[my-header]{child=1}
]]

has the following effects:

\l[add the current header to the list of extra parents of the child]

This allows a section to have multiple parents, e.g. to include it into multiple categories.

For example:

\C[[[
\h[1][Animals]

\h[2][Mammals]

\h[3][Dog]

\h[3][Cat]

\h[2][Cute animals]

Cute animals:

\l[\xref[dog]{child=1}]
\l[\xref[cat]]
]]]

would render something like:

\C[[[
h1 Animals

h2 Mammals

h3 Dog (parent section: Mammals)
(other parent sections: Cute animals)

h3 Cat (parent section: Mammals)

\h[2][Cute animals]

\l[\xref[dog]]
\l[\xref[cat]]
]]]

so note how "Dog" has a list of extra parents including "Cute animals", but "Cat" does not, because only the reference to "Dog" sets \c[child=1].

This property does not affect how the \x[table-of-contents][\c[\\toc]] is rendered. We could insert such sections there multiple times, but that has the downside that browser Ctrl + F searches would hit the same thing multiple times on the table of contents, which might make finding things harder.

\C[[
\h[2][My title]{id=my-id}

Read this \x[my-id][amazing section].
]]

If the second argument, the \c[content], is not present, it expands to the header title, e.g.:

\C[[
\h[2][My title]{id=my-id}

Read this \x[my-id].
]]

is the same as:

\C[[
\h[2][My title]{id=my-id}

Read this \x[my-id][My title].
]]
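
The expansion rule above can be sketched in JavaScript as follows. This is only an illustration and not the actual Cirodown implementation: the function name \c[renderXref], the \c[exampleTitles] map and the exact HTML output are assumptions made for this example.

```javascript
// Hypothetical ID table: maps each known ID to the title of its header.
const exampleTitles = new Map([['my-id', 'My title']]);

// Render a cross reference. When `content` is not given, fall back to the
// title of the target, as described above. Unknown IDs are reported as
// errors so that internal links cannot silently break.
// Note: HTML escaping is left out to keep the sketch short.
function renderXref(titlesById, id, content) {
  if (!titlesById.has(id)) {
    throw new Error(`cross reference to unknown ID: ${id}`);
  }
  const text = content === undefined ? titlesById.get(id) : content;
  return `<a href="#${id}">${text}</a>`;
}

console.log(renderXref(exampleTitles, 'my-id'));
console.log(renderXref(exampleTitles, 'my-id', 'amazing section'));
```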

\h[3][Cross reference style]

To also show the section auto-generated number as in "Section X.Y My title" we add the optional \c[[{style=full}]] \x[positional-vs-named-arguments][named parameter] to the cross reference, for example:

\C[[
\h[2][My title]{id=my-id}

Read this \x[my-id]{style=full}.
]]

Example: \x[cross-reference-style]{style=full}.

\h[3][Internal cross file references]

Reference to the first header of another file: \x[not-readme].

Reference to the first header of another file that is a second inclusion: \x[included-by-not-readme].

Reference to another header of another file, with \x[cross-reference-style] equals "full": \x[h2-in-not-the-readme]{style=full}.

Reference to an image in another file: \x[fig-not-readme-xi]{style=full}.

Remember that the \x[the-toplevel-header][ID of the toplevel header] is automatically derived from its file name, that's why we have to use:

\C[[
\x[not-readme]
]]

instead of:

\C[[
\x[not-the-readme]
]]

Reference to an internal header of another file: \x[h2-in-not-the-readme]. By default, that header ID gets prefixed by the ID of the top header.

When using \x[html-single-page] mode, the cross file references end up pointing to an ID inside the current HTML page, e.g.:

\C[[
<a href="#not-readme">
]]

rather than:

\C[[
<a href="not-readme.html/#not-readme">
]]

This is why IDs must be unique for elements across all pages.

\h[4][Internal cross file references internals]

When running in Node.js, Cirodown dumps the IDs of all processed files to a \c[out/ids.sqlite3] file, and then reads from that file when IDs are needed.

When converting a directory, \c[out/ids.sqlite3] is placed inside the input directory. When converting individual files, \c[out/ids.sqlite3] is placed in the current working directory. The file is not created or used when handling input from stdin.

When running in the browser, the same JavaScript API will send queries to the server instead of a local SQLite database.

To inspect the ID database to debug it, you can use:

\C[[
sqlite3 out/ids.sqlite3 .dump
]]
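
The placement rules for \c[out/ids.sqlite3] described above can be summarized by the following JavaScript sketch. The function name \c[idDatabasePath] and the shape of its \c[input] argument are hypothetical and only serve to restate the rules as code.

```javascript
// Decide where the ID database lives for a given conversion.
// `input.kind` is one of 'directory', 'file' or 'stdin' (assumed names).
function idDatabasePath(input) {
  if (input.kind === 'stdin') {
    // The database is neither created nor used for stdin input.
    return null;
  }
  // Directory conversion: inside the input directory.
  // Individual file conversion: inside the current working directory.
  const base = input.kind === 'directory' ? input.path : input.cwd;
  return `${base}/out/ids.sqlite3`;
}

console.log(idDatabasePath({ kind: 'directory', path: 'my-book', cwd: '/home/me' }));
console.log(idDatabasePath({ kind: 'file', path: 'my-book/a.ciro', cwd: '/home/me' }));
```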

\h[3][Automatic ID from title]

If a \x[the-toplevel-header][non-toplevel] macro has the \c[title] property but no explicit \c[id], an ID is created automatically from the \c[title], by applying the following transformations:

\l[convert all of \c[A-Z] characters to lowercase]
\l[convert consecutive sequences of all non \c[a-z0-9] ASCII characters to a single hyphen \c[-]. Note that this leaves non-ASCII characters untouched.]
\l[strip leading or trailing hyphens]

Note how those rules leave non ASCII Unicode characters untouched, as capitalization and determining if something "is a letter or not" in those cases can be tricky.

So for example, the following automatic IDs would be generated: \x[table-examples-of-automatically-generated-ids].

\table{title=Examples of automatically generated IDs}
[
\tr[
  \th[title]
  \th[id]
  \th[comments]
]
\tr[
  \td[My favorite title]
  \td[my-favorite-title]
  \td
]
\tr[
  \td[Ciro's markdown is awesome]
  \td[ciro-s-markdown-is-awesome]
  \td
]
\tr[
  \td[The École Polytechnique]
  \td[the-École-polytechnique]
  \td[We leave the non ASCII uppercase \a[https://en.wikipedia.org/wiki/Acute_accent][acute accented] \c[e], \c[É], untouched by default]
]
]
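
The transformation rules and the table above can be reproduced with a few lines of JavaScript. This is a minimal sketch written for this section, not the code used by Cirodown itself, and the name \c[titleToId] is made up for the example.

```javascript
// Convert a title into an automatic ID following the three rules above.
function titleToId(title) {
  return title
    // 1. Lowercase only the ASCII letters A-Z.
    .replace(/[A-Z]/g, (c) => c.toLowerCase())
    // 2. Collapse each run of ASCII characters outside a-z0-9 into a
    //    single hyphen. Non-ASCII characters are left untouched.
    .replace(/[\x00-\x2f\x3a-\x60\x7b-\x7f]+/g, '-')
    // 3. Strip leading and trailing hyphens.
    .replace(/^-+|-+$/g, '');
}

console.log(titleToId('My favorite title'));
console.log(titleToId("Ciro's markdown is awesome"));
console.log(titleToId('The École Polytechnique'));
```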

For \x[the-toplevel-header][the toplevel header], its ID is derived from the basename of the Cirodown file without extension instead of from the \c[title] argument.

\h[2][Headers \c[[\h]]]
{id=headers}

\h[3][Unlimited header levels]

There is no limit to how many levels we can have!

HTML is randomly limited to \c[h6], so Cirodown just renders higher levels as an \c[h6] with a \c[data-level] attribute to indicate the actual level for possible CSS styling:

\C[[
<h6 data-level="7">My title</h6>
]]
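
As an illustration of that rule, a renderer could pick the tag as in the following JavaScript sketch. The function name \c[renderHeader] is hypothetical, and the assumption that levels 1 to 6 map to plain \c[h1] to \c[h6] without extra attributes is ours.

```javascript
// Render a header of arbitrary level to HTML.
// Levels above 6 fall back to h6 and carry the real level in data-level.
// Note: HTML escaping of the title is left out to keep the sketch short.
function renderHeader(level, title) {
  if (level <= 6) {
    return `<h${level}>${title}</h${level}>`;
  }
  return `<h6 data-level="${level}">${title}</h6>`;
}

console.log(renderHeader(2, 'Quick start'));
console.log(renderHeader(7, 'My title'));
```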

\h[4][My h4]

\h[5][My h5]

\h[6][My h6]

\h[7][My h7]

\h[8][My h8]

\h[9][My h9]

\h[10][My h10]

\h[11][My h11]

\h[12][My h12]

\h[13][My h13]

\h[3][Table of contents \c[[\toc]]]
{id=table-of-contents}

Only one ToC shows per document. Any ToC besides the first one is ignored.

The ToC ignores \c[[\h[1\]]] by default, as we encourage that header level to appear only once and represent the main title under which the entire document goes.

\h[4][id_scope]

TODO implement.

If this header attribute is \c[true], then the IDs of all children are prefixed with the ID of this header + a slash \c[/].

This property is true by default for cross file references, although it can be turned off explicitly with \c[id-scope=false].

References within a single scope do not need the parent scope prefix.

\h[3][Header explicit levels vs nesting design choice]

Arguably, the language would be even saner if we did:

\C[[
\h[My h1][

Paragraph.

\h[My h2][]
]
]]

rather than having explicit levels.

But, like most available markups, we chose not to do it because it leads to too many nesting levels, and makes it hard to determine where you are without tooling.

\h[3][Includes \c[[\include]]]
{id=includes}

The \c[[\include]] macro allows including external Cirodown headers under the current header.

It exists to allow optional single page HTML output while still retaining the ability to:

\l[split up large input files into multiple files to make renders faster during document development]
\l[suggest an optional custom output split with one HTML output per Cirodown input, in order to avoid extremely large HTML pages which could be slow to load]

\c[[\include]] takes one mandatory argument, which is the relative path without extension to the \c[.ciro] file that you want to include.

Things would have been a bit more clean if the argument would take the ID of the header you want to include rather than the filename, which is analogous to how \x[internal-cross-references][\c[\\x]] works, but that would mean that a single page build would require an initial parse to determine IDs, so we are just going with the easier option of pointing out the file name directly.

Headers of the included document are automatically shifted to match the level of the child of the level where they are being included.
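
One way to picture the level shift is the JavaScript sketch below. It assumes that the first header of the included file is its toplevel header and that it should end up exactly one level below the header that contains the include. The function name \c[shiftIncludedLevels] is hypothetical.

```javascript
// Shift the header levels of an included document so that its first header
// becomes a direct child of the header where the include appears.
function shiftIncludedLevels(includedLevels, parentLevel) {
  if (includedLevels.length === 0) {
    return [];
  }
  // Assumption: the first header of the included file is its toplevel header.
  const offset = parentLevel + 1 - includedLevels[0];
  return includedLevels.map((level) => level + offset);
}

// A file with headers h1, h2, h3, h2 included under an h3:
console.log(shiftIncludedLevels([1, 2, 3, 2], 3));
```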

If \x[html-single-page] is given, the external document is rendered embedded into the current document directly, essentially as if the source had been copy pasted (except for small corrections such as the header offsets).

Otherwise, the following effects happen:

\l[
  The headers of the included tree appear in the \x[table-of-contents][table of contents] of the document as links to the corresponding external files.

  This is implemented simply by reading a previously generated database file much like \x[internal-cross-file-references-internals], which avoids the slowdown of parsing all included files every time.

  As a result, however, you have to do an initial parse of all files in the project to extract their headers, just as you would need to do when linking to those headers.
]
\l[the include itself renders as a link to the included document]

\l[\x[html-single-page]]

Here is an example of inclusion of the files \c[not-readme.ciro] and \c[not-readme-2.ciro].

\include[not-readme]

\include[not-readme-2]

\h[3][Skipping header levels]

The very first header of a document can be of any level, although we highly recommend your document to start with a \c[[\h[1\]]], and to contain exactly one \c[[\h[1\]]], as this has implications such as:

\l[\c[[\h[1\]]] is used for the document title: \x[html-document-title]]
\l[\c[[\h[1\]]] does not show on the \x[table-of-contents]]

After the initial header however, you must not skip a header level, e.g. the following would give an error because it skips level 3:

\C[[
\h[1][my 1]

\h[2][my 2]

\h[4][my 4]
]]
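
The check behind that error can be expressed as the following JavaScript sketch, which works on the list of header levels in document order. The name \c[findSkippedLevel] is made up for this example and the real implementation may report the error differently.

```javascript
// Return the index of the first header that skips a level, or -1 if none.
// The first header may have any level. Every later header may go at most
// one level deeper than the header right before it.
function findSkippedLevel(levels) {
  for (let i = 1; i < levels.length; i++) {
    if (levels[i] > levels[i - 1] + 1) {
      return i;
    }
  }
  return -1;
}

console.log(findSkippedLevel([1, 2, 4])); // the h4 at index 2 skips level 3
console.log(findSkippedLevel([2, 3, 3]));
```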

\h[3][The toplevel header]

If the document has only a single header of the highest level, e.g. like the following has only a single \c[h2]:

\C[[
\h[2][My 2]

\h[3][My 3 1]

\h[3][My 3 2]
]]

then this has some magical effects.

\h[4][The toplevel header IDs don't show]

Header IDs won't show for the toplevel level. For example, the headers would render like:

\C[[
My 2

1. My 3 1

2. My 3 2
]]

rather than:

\C[[
1. My 2

1.1. My 3 1

1.2. My 3 2
]]

This is because in this case, we guess that the \c[h2] is the toplevel.

\h[4][The ID of the first header is derived from the filename]

TODO: we kind of wanted this to be the ID of the toplevel header instead of the first header, but this would require an extra postprocessing pass (to determine if the first header is toplevel or not), which might affect performance, so we are not doing it right now.

When the Cirodown input comes from a file (and not e.g. stdin), the default ID of the first header in the document is derived from the basename of the Cirodown input source file rather than from its title.

This is especially relevant when \x[includes][including] other files.

For example, in a file named \c[my-file.ciro] which contains:

\C[[
\h[1][Awesome cirodown file]
]]

the ID of the header is \c[my-file] rather than \c[awesome-cirodown-file]. See also: \x[automatic-id-from-title].

If the file is an \x[index-files][index file], then the basename of the parent directory is used instead, e.g. the toplevel ID of a file:

\C[[my-subdir/README.ciro]]

would be:

\C[[#my-subdir]]

rather than:

\C[[#README.ciro]]
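
The filename rules above can be sketched in JavaScript as follows. The function name \c[toplevelId] is hypothetical, and so is the assumption that \c[README.ciro] is the only index file name. The case of an index file with no parent directory in its path is not covered by the text above, so the sketch simply falls back to the basename there.

```javascript
// Assumption for this sketch: README.ciro is the only index file name.
const INDEX_BASENAMES = new Set(['README.ciro']);

// Derive the default ID of the first header from a '/'-separated input path.
function toplevelId(filePath) {
  const parts = filePath.split('/').filter((part) => part !== '' && part !== '.');
  const basename = parts[parts.length - 1];
  if (INDEX_BASENAMES.has(basename) && parts.length > 1) {
    // Index file: use the basename of the parent directory instead.
    return parts[parts.length - 2];
  }
  // Regular file: basename without the .ciro extension.
  return basename.replace(/\.ciro$/, '');
}

console.log(toplevelId('my-file.ciro'));
console.log(toplevelId('my-subdir/README.ciro'));
```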

\h[2][Lists \c[[\l]]]
{id=lists}

With implicit container:

\l[a]
\l[b]
\l[c]

Equivalent with explicit container:

\ul[
\l[a]
\l[b]
\l[c]
]

The explicit container is required if you want to add properties to the list, e.g. a title and an ID: \x[list-my-id]:

\ul
{id=list-my-id}
[
\l[a]
\l[b]
\l[c]
]

It is also required if you want ordered lists:

\ol[
\l[first]
\l[second]
\l[third]
]

Nested lists with implicit containers:

\l[
  a

  \l[a1]
  \l[a2]
  \l[a3]
]
\l[b]
\l[c]

List item with a paragraph inside of it:

\l[a]
\l[
  I have

  Multiple paragraphs.

  \l[And]
  \l[also]
  \l[a]
  \l[list]
]
\l[c]

\h[2][Images \c[[\image]]]
{id=images}

A block image with \x[macro-capitalization-for-block-vs-inline][capital] 'i' \c[Image] can be seen at \x[fig-my-xi-chrysanthemum]:

\Image[https://raw.githubusercontent.com/cirosantilli/media/master/Chrysanthemum_Xi_Jinping_with_black_red_liusi_added_by_Ciro_Santilli.jpg]
{title=Xi Chrysanthemum is a very nice image}
{id=fig-my-xi-chrysanthemum}
{source=https://commons.wikimedia.org/wiki/File:Lotus_flower_(978659).jpg}
{description=
  We can have multiple paragraphs here, just as for any other argument.

  I'm not even kidding.
}

Here is one without a description but with an ID so we can link to it: \x[fig-my-xi-chrysanthemum-2]{style=full}.

\Image[https://raw.githubusercontent.com/cirosantilli/media/master/Chrysanthemum_Xi_Jinping_with_black_red_liusi_added_by_Ciro_Santilli.jpg]
{id=fig-my-xi-chrysanthemum-2}

We must use \x[cross-reference-style][\c[style=full]] here because otherwise the link text would be empty and not show at all.

If the image has neither ID nor title, then it just gets a generic caption, and it is not possible to link to it with an \x[internal-cross-references][internal cross reference], e.g.:

\Image[https://raw.githubusercontent.com/cirosantilli/media/master/Chrysanthemum_Xi_Jinping_with_black_red_liusi_added_by_Ciro_Santilli.jpg]

The image does however get an automatically generated ID based on its image number so that readers can link to it on the rendered version, e.g. as:

\C[[
#fig-123
]]

This link is of course not stable across document revisions, since if an image is added before that one, the link will break.

This kind of image is discouraged, because in paged output formats like PDF, it could float away from the text that refers to the image.

And here is an \image[https://raw.githubusercontent.com/cirosantilli/media/master/Chrysanthemum_Xi_Jinping_with_black_red_liusi_added_by_Ciro_Santilli.jpg][Xi Chrysanthemum] inline one with lower case 'i'. Inline images can't have captions.

\h[3][Where to store images]

If you are making a limited repository that will not have a ton of images, then you can get away with simply git tracking your images in the main repository.

With this setup, no further action is needed. For example, with a file structure of:

\C[[
./README.ciro
./Image_with_a_descriptive_title.jpeg
]]

just use the image from \c[README.ciro] as:

\C[[
Here is a nice image:

\Image[Image_with_a_descriptive_title.jpeg]
]]

However, if you are making a huge tutorial, which can have a huge undefined number of images (i.e. any scientific book), then you likely don't want to git track your images in the git repository.

We recommend the following approach instead.

Create a separate GitHub repository in addition to the main one containing the text, for example:

\l[\c[./my-tutorial/]]
\l[\c[./my-tutorial-media/]]

The \c[*-media] name suffix is not mandatory, but if you use this default, \c[cirodown] will handle it for you without any further configuration.

And then set the \x[media-source-type] option in your \x[cirodown-json]:

\C[[
{
  "media_source_type": "media_repo"
}
]]

Now, simply drop your images into the \c[my-tutorial-media] repository and use them just as before, and everything will just work!

\C[[
\Image[Image_with_a_descriptive_title.jpeg]
]]

\c[cirodown] will even automatically add and push used images in the \c[my-tutorial-media] repository for you during publishing!

You should then use the following rules inside \c[my-tutorial-media]:

\l[give every file a very descriptive and unique name as a full English sentence]
\l[never ever delete any files, nor change their content, unless it is an improvement in format that does not change the information contained in the image TODO link to nice Wikimedia Commons guideline page]

This way, even though the repositories are not fully in sync, anyone who clones the latest version of the \c[*-media] directory will be able to view any version of the main repository.

Then, if one day the media repository ever blows up GitHub's limit, you can just migrate the images to another image server that allows arbitrary basenames, e.g. AWS, and just configure your project to use that new media base URL with the \x[media-base-url] option.

The reason why images should be kept in a separate repository is that images are hundreds or thousands of times larger than hand written text.

Therefore, images could easily fill up the maximum repository size you are allowed: \a[https://webapps.stackexchange.com/questions/45254/file-size-and-storage-limits-on-github#84746] and then what will you do when GitHub comes asking you to reduce the repository size?

\a[https://git-lfs.github.com/][Git LFS] is one approach to deal with this, but we feel that it adds too much development overhead.

\h[3][Image generators]

TODO implement: mechanism where you enter a textual description of the image inside the code body, and it then converts to an image, adds to the \c[-media] repo and pushes all automatically. Start with dot.

\h[2][Code \c[[\c]] and \c[[\C]]]
{id=code}

Inline code blocks (code blocks that should appear in the middle of a paragraph) are done with lower case \c[c]:

\C[[
My \c[inline] code.
]]

renders to:

\C[[
<p>My <code>inline</code> code.</p>
]]

Code blocks (code blocks that should appear outside of paragraphs in their own lines) are done with capital \c[C]:

\C[[
A paragraph.

\C[
A block
of code
]

Another paragraph.
]]

renders to:

\C[[
<p>A paragraph.</p>
<pre><code>A block
of code</code></pre>
<p>Another paragraph.</p>
]]

If the content of the code block has many characters that you would need to \x[escape-characters][escape], you will often want to use \x[literal-arguments], which work just like they do for any other argument. For example:

\C[[[
A paragraph.

\C[[
And now, some long, long code, with lots
of chars that you would need to escape:
\ [  ] {  }
]]

A paragraph.
]]]

Note that the initial newline is skipped automatically in code blocks, just as for any other element, due to: \x[argument-leading-newline-removal], so you don't have to worry about it.

The capital vs lower case theme is also used in other elements, for example \x[math].

The distinction between inline \c[c] and block \c[C] code blocks is needed because in HTML, \a[https://stackoverflow.com/questions/5371787/can-i-have-a-pre-tag-inside-a-p-tag-in-tumblr/58603596#58603596][\c[pre] cannot go inside \c[p]].

We could have chosen to do some magic to differentiate between them, e.g. checking if the block is the only element in a paragraph, but we decided not to do that to keep the language saner.

\h[2][Mathematics \c[[\M]] and \c[[\m]]]
{id=math}

Via KaTeX server side, oh yes!

My inline \m[[\sqrt{1 + 1}]] is awesome.

Escape the closing bracket with backslash: \m[1 - \[1 + 1\] = -1].

Escape the closing bracket with double open and close: \m[[1 - [1 + 1] = -1]] is awesome.

Display math is done with upper case \c[M]:

\M[[
\sqrt{1 + 1} \\
\sqrt{1 + 1}
]]

HTML escaping happens as you would expect, e.g. < shows fine in:

\C[[[
\M[[
1 < 2
]]
]]]

which renders as:

\M[[
1 < 2
]]

Equation IDs and titles and linking to equations work identically to \x[images][images], see that section for full details. Here is one equation reference example: \x[eq-my-first-equation].

\M{title=My first equation}[[
1 + 1 = 2
]]

\h[3][Math defines across blocks]

TODO pending upstream tag bump: \a[https://github.com/KaTeX/KaTeX/pull/2091]. First, here is an invisible block with a \c[\\newcommand] definition after this paragraph:

\M{show=0}[[
\newcommand{\foo}[0]{bar}
]]

We set the \c[\{show=0\}] argument because this block only contains KaTeX definitions, and should not render to anything.

Second block using it:

\C[[[
\M[[
\foo
]]
]]]

First, an invisible block with a \c[\\gdef] definition after this paragraph:

\M{show=0}[[
\gdef\foogdef{bar}
]]

Second block using it:

\M[[
\foogdef
]]

\h[2][Tables \c[[\table]], \c[[\tr]], \c[[\th]] and \c[[\td]]]
{id=tables}

Similar to lists. Implicit container with optional indentation to improve readability:

\tr[
  \th[Header 1]
  \th[Header 2]
]
\tr[
  \td[1 1]
  \td[1 2]
]
\tr[
  \td[2 1]
  \td[2 2]
]

Explicit container to add further properties: \x[table-my-table].

\table{title=My table title}
{id=table-my-table}
[
\tr[
  \th[Header 1]
  \th[Header 2]
]
\tr[
  \td[1 1]
  \td[1 2]
]
\tr[
  \td[2 1]
  \td[2 2]
]
]

\h[2][Comments]

The \c[Comment] and \c[comment] macros are regular macros that do not produce any output. Capitalization is explained at: \x[macro-capitalization-for-block-vs-inline].

You will therefore mostly want to use them with a \x[literal-arguments][literal argument], which will, as for any other macro, ignore any macros inside of it.

\Comment[[
One line and \m[1 + 1]

\l[a]
\l[b]

Another line.
]]

And here is: \comment[[\m[1 + 1\]]] an inline one.

\comment[[\m[1 + 1\]]] inline at the start.

\h[2][Cirodown syntax]

\h[3][Macro argument syntax]

\h[4][Positional vs named arguments]

Every argument in Cirodown is either positional or named.

For example, in a \x[headers][header] definition with an ID:

\C[[
\h[1][My asdf]{id=asdf qwer}{id_scope=myscope}
]]

we have:

\l[two positional arguments: \c[[[1]]] and \c[[[My asdf]]]. Those are surrounded by \c[[[]]] and have no name]
\l[
  two named arguments: \c[[{id=asdf qwer}]] and \c[[{id_scope=myscope}]].

  The first one has name \c[id] and the mandatory separator \c[=], followed by the value \c[asdf qwer].
]

You can determine if an argument of a macro is positional or named by using \x[help-macros]. Its output contains something like:

\C[[
  "h": {
    "name": "h",
    "positional_args": [
      {
        "name": "level"
      },
      {
        "name": "content"
      }
    ],
    "named_args": {
      "id": {
        "name": "id"
      },
      "id_scope": {
        "name": "id_scope"
      }
    },
]]

and so we see that \c[level] and \c[content] are positional arguments, and \c[id] and \c[id_scope] are named arguments.

Generally, positional arguments are few (otherwise it would be hard to know which is which), and are almost always used for a given element so that they save us from typing the name too many times.

The order of positional arguments must of course be fixed, but named arguments can go anywhere. We can even mix positional and named arguments however we want, although this is not advised for clarity.

The following are therefore all equivalent:

\C[[
\h[1][My asdf]{id=asdf qwer}{id_scope=myscope}
\h[1][My asdf]{id_scope=myscope}{id=asdf qwer}
\h{id=asdf qwer}{id_scope=myscope}[1][My asdf]
\h{id_scope=myscope}[1]{id=asdf qwer}[My asdf]
]]

Just like named arguments, positional arguments are never mandatory.

If not given, most positional arguments will default to an empty string.

However, some positional arguments can have special effects if not given.

For example, an anchor with the first positional argument present (the URL), but not the second positional argument (the link text) as in:

\C[[
\a[http://example.com]
]]

has the special effect of generating automatic links as in:

\C[[
\a[http://example.com][http://example.com]
]]

See also: \x[links].

\h[4][JavaScript interface for arguments]

The JavaScript interface sees arguments as follows:

\C[
function macro_name(args)
]

where args is a dict such that:

\l[optional arguments have the key / value pairs explicitly given on the call]
\l[
  mandatory arguments have a key documented by the API, and the value on the call.

  For example, the link API names its arguments \c[href] and \c[text].
]
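
For instance, a link macro could be written along the lines of the JavaScript sketch below. It uses the \c[href] and \c[text] argument names mentioned above, but the function body and the exact HTML output are only assumptions made for illustration, not the actual Cirodown code.

```javascript
// Sketch of a link macro that receives its arguments as a dict.
// When `text` is not given, the URL itself is used as the link text,
// which produces the autolink behaviour.
// Note: HTML escaping is left out to keep the sketch short.
function a(args) {
  const href = args.href;
  const text = args.text === undefined ? href : args.text;
  return `<a href="${href}">${text}</a>`;
}

console.log(a({ href: 'http://example.com' }));
console.log(a({ href: 'http://example.com', text: 'my custom link text' }));
```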

\h[4][Newlines between arguments]

The macro name and the first argument, and two consecutive arguments, can be optionally separated by exactly one newline character, e.g.:

\C[[
\h[2]
{id-scope=true}
[Design goals]
]]

is equivalent to:

\C[[
\h[2]{id-scope=true}[Design goals]
]]

and this non-recommended mixed style:

\C[[
\h[2]{id-scope=true}
[Design goals]
]]

This greatly improves the readability of long argument lists by having them one per line.

\h[4][Escape characters]

For non-literal macro arguments, the rule is simple: you must escape all of:

\l[\c[\\]]
\l[\c[\[] and \c[\]]]
\l[\c[\{] and \c[\}]]

This is good for short arguments of regular text, but for longer blocks of \x[code] or \x[math], you may want to use \x[literal-arguments].

\h[4][Literal arguments]

Arguments that are opened with more than one square bracket \c[\[] or curly brace \c[\{] are literal arguments.

In literal arguments, Cirodown is not parsed, and the entire argument is considered as text until a corresponding close with the same number of characters.

Therefore, you cannot have nested content, but it makes it extremely convenient to write \x[code] or \x[math].

For example, a multiline code block with double open and double close square brackets inside can be enclosed in triple square brackets:

\C[[[
A literal argument looks like this in Cirodown:

\C[[
\C[
A multiline

code block.
]
]]

And another paragraph.
]]]

The same works for inline code:

\C[[[
The program \c[[puts("]");]] is very complex.
]]]

Within literal blocks, only one thing can be escaped with backslashes: leading \c[\[] or trailing \c[\]].

The rule is that:

\l[if a literal argument starts with a sequence of one or more \c[\\] that is followed by an argument open character (e.g. \c[\[]), remove the first \c[\\] and treat the other characters as regular text]
\l[if the last character of a literal argument is a \c[\\], ignore it and treat the following closing character (e.g. \c[\]]) as regular text]

See the following open input / output examples:

\C[[[
\c[[\ b]]
<code>\ b</code>

\c[[\a b]]
<code>\a b</code>

\c[[\[ b]]
<code>[ b</code>

\c[[\\[ b]]
<code>\[ b</code>

\c[[\\\[ b]]
<code>\\[ b</code>
]]]

and close examples:

\C[[[[
\c[[a \]]
<code>a \</code>

\c[[a \]]]
<code>a ]</code>

\c[[a \\]]]
<code>a \]</code>
]]]]
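
The two rules can be sketched as a small JavaScript function that post-processes the raw content of a literal argument. The name \c[unescapeLiteralArgument] and its parameters are hypothetical, and the logic is reconstructed from the examples above rather than taken from the Cirodown source:

```javascript
// Hypothetical helper: `content` is the raw text between the literal
// argument delimiters, `open` / `close` are the delimiter characters.
function unescapeLiteralArgument(content, open = '[', close = ']') {
  // Open rule: leading backslashes followed by an open character
  // lose their first backslash.
  let i = 0;
  while (i < content.length && content[i] === '\\') {
    i++;
  }
  if (i > 0 && content[i] === open) {
    content = content.slice(1);
  }
  // Close rule: a backslash right before a final close character
  // is dropped, and the close character is kept as regular text.
  const n = content.length;
  if (n >= 2 && content[n - 1] === close && content[n - 2] === '\\') {
    content = content.slice(0, n - 2) + close;
  }
  return content;
}

console.log(unescapeLiteralArgument('\\\\[ b'));
console.log(unescapeLiteralArgument('a \\]'));
```

Backslashes that are not adjacent to a leading open or trailing close character are left untouched, as in the first two open examples.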

\h[4][Argument leading newline removal]

If the very first character of an argument is a newline, then that character is ignored.

For example:

\C[[[
\C[[
a

b
]]
]]]

generates something like:

\C[
<pre><code>a

b
</code></pre>
]

instead of:

\C[
<pre><code>
a

b
</code></pre>
]

This is extremely convenient to improve the readability of code blocks and similar constructs.

If you absolutely need an opening newline, just add a second leading or trailing newline to the macro argument, e.g.:

\C[[[
\C[[

a

b

]]
]]]

\h[4][Argument automatic indentation removal]

Inside of a non-literal block, the very first non-whitespace character determines the indentation level of the content.

This allows seamlessly indenting complex nested content to make it more readable.

For example, a list with complex content could be written without indentation as:

\C[[[
\l[a]
\l[
b

\C[[
And now some code
]]

And a paragraph.
]
\l[c]
]]]

but it would be more readable as the equivalent:

\C[[[
\l[a]
\l[
  b

  \C[[
  And now some code
  ]]

  And a paragraph.
]
\l[c]
]]]

If something tries to reduce the current indentation level, that leads to an error:

\C[[
\l[a]
\l[
  b

I'm bad because I have negative indentation.

  Back to good.
]
\l[c]
]]
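
A minimal sketch of this behaviour follows, assuming the argument content is available as a single string. The function name \c[removeIndentation] and the error message are made up for illustration, and the real implementation may work on tokens rather than on lines:

```javascript
// Hypothetical helper: strips the indentation level set by the first
// non-blank line from every line of `content`.
function removeIndentation(content) {
  const lines = content.split('\n');
  const first = lines.find((line) => line.trim() !== '');
  if (first === undefined) {
    // Nothing but whitespace: there is no indentation level to remove.
    return content;
  }
  const indent = first.length - first.trimStart().length;
  return lines.map((line, i) => {
    if (line.trim() === '') {
      // Blank lines carry no indentation information.
      return '';
    }
    const lineIndent = line.length - line.trimStart().length;
    if (lineIndent < indent) {
      throw new Error(`line ${i + 1}: indentation below the current level`);
    }
    return line.slice(indent);
  }).join('\n');
}

console.log(removeIndentation('  b\n\n  And a paragraph.'));
```

Lines indented deeper than the first one keep their extra indentation, so nested content stays nested.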

\h[3][Macro capitalization for block vs inline]

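Block macros have capitalized names and their inline counterparts have lowercase names, e.g. \c[[\C]] for code blocks and \c[[\c]] for inline code. This is just a naming convention without any magic attached to it. It is currently used for macros such as: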

\l[\x[math]]
\l[\x[code]]
\l[\x[comments]]

We haven't found a sane way of getting rid of that distinction, so for now we just force users to explicitly select between them.

The distinction is required because there are lots of things you cannot put inside paragraphs in HTML.

\h[2][Tooling]

Unlike languages that rely on ad-hoc tooling, we will support every required tool that can feasibly live in this repository, in a centralized manner.

\h[3][\c[cirodown] executable]
{id=cirodown-executable}

Convert a \c[.ciro] file to HTML and output the HTML to stdout:

\C[[
cirodown README.ciro
]]

Convert all \c[.ciro] files in a directory to HTML files next to each corresponding \c[.ciro] file, e.g. \c[somefile.ciro] to \c[somefile.html]:

\C[[
cirodown .
]]

In order to resolve \x[internal-cross-file-references], this actually does two passes:

\l[first an ID extraction pass, which parses all inputs and dumps their IDs to the ID database]
\l[then a second render pass, which uses the IDs in the ID database]
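
The two passes can be pictured with the following toy sketch. Everything in it is an assumption made for illustration: the function names, the in-memory \c[Map] standing in for the ID database, and the fact that only explicitly given \c[{id=...}] arguments are extracted.

```javascript
// Pass 1 (hypothetical): collect every explicit ID and remember
// which input file defines it.
function extractIds(files) {
  const db = new Map();
  for (const [path, source] of Object.entries(files)) {
    for (const match of source.matchAll(/\{id=([^}]+)\}/g)) {
      db.set(match[1], path);
    }
  }
  return db;
}

// Pass 2 (hypothetical): resolve a reference by looking the ID up in
// the database, regardless of which file is currently being rendered.
function resolveReference(db, id) {
  if (!db.has(id)) {
    throw new Error(`unknown ID: ${id}`);
  }
  return `${db.get(id).replace(/\.ciro$/, '.html')}#${id}`;
}

const files = {
  'not-readme.ciro': '\\h[2][Some header]\n{id=some-header}\n',
  'other.ciro': 'See: \\x[some-header]\n',
};
const db = extractIds(files);
console.log(resolveReference(db, 'some-header'));
```

The point of the split is that \c[other.ciro] can only be rendered once the IDs of \c[not-readme.ciro] are known, so all inputs must be scanned before any of them is rendered.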

Convert a \c[.ciro] file from stdin to HTML and output the contents of \c[<body>] to stdout:

\C[[
printf 'ab\ncd\n' | cirodown --body-only
]]

\h[4][Index files]

The following basenames are considered "index files":

\l[\c[README.ciro]]
\l[\c[index.ciro]]

Those basenames have the following magic properties:

\l[the default output file name for an index file in HTML output is always \c[index.html], including that of \c[README.ciro]. This way it will appear on both the root of the HTML output and on the GitHub home page once GitHub adds Cirodown support.]
\l[the default \x[the-toplevel-header][toplevel header] ID of an index file is derived from the parent directory basename rather than from the source file basename]
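
The first property can be sketched as a path mapping. The helper name \c[defaultOutputPath] is hypothetical, and keeping the output next to the input directory is an assumption based on the directory conversion behaviour described earlier:

```javascript
const path = require('path');

// Basenames documented above as index files.
const INDEX_BASENAMES = new Set(['README.ciro', 'index.ciro']);

// Hypothetical helper: default HTML output path for a .ciro input path.
function defaultOutputPath(inputPath) {
  const dir = path.posix.dirname(inputPath);
  const base = path.posix.basename(inputPath);
  const outBase = INDEX_BASENAMES.has(base)
    ? 'index.html'
    : path.posix.basename(base, '.ciro') + '.html';
  return path.posix.join(dir, outBase);
}

console.log(defaultOutputPath('README.ciro'));
console.log(defaultOutputPath('somefile.ciro'));
```

The sketch uses \c[path.posix] so that the result uses forward slashes on every platform.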

\h[4][\c[cirodown] executable most important options]
{id=cirodown-executable-most-important-options}

\h[5][\c[--watch]]
{id=watch}

Don't quit \c[cirodown] immediately.

Instead, watch the selected file or directory for changes, and rebuild individual files when changes are detected.

Watch a single file:

\C[[
cirodown --watch README.ciro
]]

Now you can just edit \c[README.ciro], save the file in your editor, and refresh the webpage to see your change. There is no need to run a \c[cirodown] command explicitly every time.

TODO: integrate Live Preview: \a[https://asciidoctor.org/docs/editing-asciidoc-with-live-preview/] to also dispense with the browser refresh.

Watch every \c[.ciro] file in an entire directory:

\C[[
cirodown --watch .
]]

This automatically first does an ID extraction pass on all files to support \x[internal-cross-file-references].

The output of conversion to \c[.html] files is saved automatically instead of outputting it to stdout as done without \c[--watch].

Exit by entering Ctrl + C on the terminal.

\h[5][\c[--publish]]
{id=publish}

Cirodown tooling is so amazing that we also take care of the HTML publishing for you.

\h[6][Publish to GitHub Pages]

If you want to publish your root user page, which appears at \c[/] (e.g. \a[https://github.com/cirosantilli/cirosantilli.github.io] for the user \c[cirosantilli]), GitHub annoyingly forces you to use the \c[master] branch for the HTML output:

\l[\a[https://github.com/isaacs/github/issues/212]]
\l[\a[https://stackoverflow.com/questions/31439951/how-can-i-use-a-branch-other-than-master-for-user-github-pages]]

Therefore, you must first place your \c[.ciro] source in a different branch and clear up the \c[master] branch for the HTML, and only then publish:

\C[[
git checkout -b dev
git branch -D master
git push --delete origin master

git add README.ciro
git commit --message 'some commit message'
cirodown --publish
]]

This will automatically \c[git push] the \c[dev] branch for you, and also create/update the \c[master] branch, which contains the desired HTML output.

Only changes committed to Git are pushed.

You will then also want to set your default repository branch to \c[dev] in the settings for that repository: \a[https://help.github.com/en/github/administering-a-repository/setting-the-default-branch].

\c[cirodown] automatically detects that we are not on the \c[master] branch, and therefore guesses that you want to publish GitHub pages on the \c[master] branch.

For any other non-root directory, \c[cirodown] sees that you are on the \c[master] branch, and automatically publishes on the \c[gh-pages] branch, so all you need is:

\C[[
git add README.ciro
git commit --message 'some commit message'
cirodown --publish .
]]

The publishing only happens if the build has no errors.

When \c[--publish] is given, stdin input is not accepted, and so the current directory is built by default, i.e. the following two are equivalent:

\C[[
./cirodown --publish
./cirodown --publish .
]]

\h[5][\c[--publish-commit <commit-message>]]
{id=publish-commit}

Like \x[publish], but also automatically:

\l[\c[git add -u] to automatically add changes to any files that are already tracked by Git]
\l[\c[git commit -m <commit-message>] to create a new commit with those changes]

This allows you to publish your changes live in a single command such as:

\C[[
cirodown --publish-commit 'my amazing change' .
]]

With great power comes great responsibility of course, but who cares!

\h[5][\c[--help-macros]]
{id=help-macros}

You can get an overview of all macros in JSON format with:

\C[[
cirodown --help-macros
]]

\h[5][\c[--html-embed]]
{id=html-embed}

Embed as many external resources as possible into a single HTML file.

The use case for this option is to produce a single HTML file for an entire build that is fully self contained, and can therefore be given to consumers and viewed offline, much like a PDF.

Examples of embeddings done:

\l[
  CSS and JavaScript are copy pasted in place into the HTML.

  The default built-in CSS and JavaScript files used by Cirodown (e.g. the KaTeX CSS \x[math][used for mathematics]) are currently all automatically downloaded as NPM package dependencies of Cirodown.

  Without \c[--html-embed], those CSS and JavaScript use their main cloud CDN URLs, and therefore require Internet connection to view the generated documents.

  The embedded version of the document can be viewed offline however.
]
\l[
  \x[images][images] are downloaded if needed and embedded as \c[data:] URLs.

  Images that are managed by the project and already present locally, e.g. those inside the project itself or those made available through \x[media-source-type], usually don't require download.

  For images linked directly from the web, we maintain a local download cache, and skip downloads if the image is already in the cache.

  To re-download due to image updates, use either:

  \l[\c[--asset-cache-update]: download all images such that the local disk timestamp is older than the HTTP modification date with \a[https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-Modified-Since][\c[If-Modified-Since]]]
  \l[\c[--asset-cache-update-force]: forcefully redownload all assets]
]

Keep in mind that certain things can never be embedded, e.g.:

\l[YouTube videos, since YouTube does not offer any download API]

\h[5][\c[--html-single-page]]
{id=html-single-page}

If given:

\l[the \x[includes][\c[include]] macro parses the target file to be included and includes it in-place]
\l[
  \x[internal-cross-file-references] are disabled, and the cross file ID database does not get updated.

  It should be possible to work around this, but we are starting with the simplest implementation that forbids it.

  The problem is that the IDs of included headers would show up as duplicates of those already in the ID database.

  This should be OK to start with because the more common use case with \c[--html-single-page] is that of including all headers in a single document.
]

Otherwise, \c[include] only adds the headers of the other file to the table of contents of the current one, but not the body of the other file. The ToC entries then point to the headers of the included external files.

You may want to use this option together with \x[html-embed] to produce fully self-contained individual HTML files for your project.

\h[5][\c[--no-html-x-extension]]
{id=html-x-no-extension}

If not given, \x[internal-cross-references] render with the \c[.html] extension as in:

\C[[
<a href=not-readme.html#h2-in-not-the-readme>
]]

This way, those links will work when rendering locally to \c[.html] files which is the default behaviour of:

\C[[
cirodown .
]]

If given however, the links render without the \c[.html] as in:

\C[[
<a href=not-readme#h2-in-not-the-readme>
]]

which is what is needed for servers such as GitHub Pages, which automatically remove the \c[.html] extension from paths.

This option is automatically implied when publishing to targets that remove the \c[.html] extension such as GitHub pages.

\h[4][\c[cirodown.json]]
{id=cirodown-json}

\h[5][\c[media_source_type]]
{id=media-source-type}

\l[\c[local]: tracked in the current Git repository, no further magic is done]
\l[
  \c[media_repo]: tracked in a separate media repository, defaults to \c[../<project-name>-media/]

  When \x[publish][publishing], Cirodown searches for files that were used in the build but not added to the media directory, and automatically does a \c[git add], \c[commit] and \c[push] for you!
]

\h[5][\c[media_base_url]]
{id=media-base-url}

TODO implement: set the base URL where media will be served from.

Default: for a GitHub repository called \c[my-tutorial], append the \c[-media] suffix and use \c[my-tutorial-media].

Further rationale at: \x[where-to-store-images].

\h[3][Editor with preview]

TODO

We must achieve an editor setup with synchronized live side-by-side preview.

Likely, we will first do a non WYSIWYG editor with side by side preview with scroll sync.

Then, if the project picks up steam, we can start considering a full WYSIWYG.

It would be amazing to have a WebKit interface that works both on the browser and locally.

Possibilities we could reuse:

\l[
  Editor.js

  Returns JSON AST!

  \l[website: https://editorjs.io/ json output]
  \l[source: https://github.com/codex-team/editor.js]
  \l[WYSIWYG: no]
  \l[preview scroll sync: yes]
]
\l[
  StackEdit

  \l[website: \a[https://stackedit.io]]
  \l[source: https://github.com/benweet/stackedit]
  \l[demo: https://stackedit.io/app]
  \l[WYSIWYG: no]
  \l[preview scroll sync: yes]
]
\l[
  Editor.md

  \l[website: \a[https://github.com/pandao/editor.md]]
  \l[source: \a[https://github.com/pandao/editor.md]]
  \l[demo: \a[https://pandao.github.io/editor.md]]
  \l[WYSIWYG: no]
  \l[preview scroll sync: yes but buggy when tested 2019-12-12 on live website]
]
\l[
  Quill

  \l[website: \a[https://quilljs.com]]
  \l[source: \a[https://github.com/quilljs/quill]]
  \l[demo: \a[https://quilljs.com]]
  \l[WYSIWYG: yes]
  \l[markdown output: no https://github.com/quilljs/quill/issues/74]
]
\l[https://ui.toast.com/tui-editor/]
\l[https://www.froala.com/wysiwyg-editor]

\h[2][Developing Cirodown]

\h[3][Test system]

Run all tests:

\C[[
npm test
]]

List all tests:

\C[[
node node_modules/mocha-list-tests/mocha-list-tests.js test.js
]]

As per: \a[https://stackoverflow.com/questions/41380137/list-all-mocha-tests-without-executing-them/58573986#58573986].

Run just one test by name:

\C[[
npm test -- 'one paragraph'
]]

As per: \a[https://stackoverflow.com/questions/10832031/how-to-run-a-single-test-with-mocha].

TODO: what if the test name is a substring?

Step debug during a test run. Add the statement:

\C[[
debugger;
]]

to where you want to break in the code, and then:

\C[[
node inspect ./node_modules/.bin/mocha test --ignore-leaks "-g" "p with id before"
]]

\h[3][Overview of files in this repository]

\l[\a[index.js]: main code. Must be able to run in the browser, so no Node.js specifics. Exposes the central \c[convert] function]
\l[\a[cirodown]: CLI executable. Is basically just a CLI interface frontend to \c[convert]]
\l[\a[test.js]: contains all the Mocha tests, see also: \x[test-system]]
\l[\a[README.md]: minimal Markdown README until GitHub / NPM support Cirodown :-)]

\h[3][Internals API]

Tokenized token stream and AST can be obtained as JSON from the API.

Errors can be obtained as JSON from the API.

Everything that you need to write Cirodown tooling is present in the main API.

All tooling will be merged into one single repo.

\h[3][The \c[[\toplevel]] implicit macro]
{id=toplevel}

Every Cirodown document is implicitly put inside a \c[[\toplevel]] document and:

\l[any optionally given arguments at the very beginning of the document will be treated as arguments of the \c[[\toplevel]] macro]
\l[anything else will be put inside the \c[content] argument of the \c[[\toplevel]] macro]

E.g., a Cirodown document that contains:

\C[[
[title=My favorite title]

And now, some content!
]]

is morally equivalent to:

\C[[
\toplevel
[title=My favorite title]
[
And now, some content!
]
]]

In terms of HTML, the \c[\\toplevel] element corresponds to the \c[<html>], \c[<head>], \c[<header>] and \c[<footer>] elements of a document.

\h[4][HTML document title]

\l[if the \c[title] argument of \c[toplevel] is given, use that]
\l[otherwise, if the document has a \c[[\h[1\]]], use the title of the first such header]
\l[otherwise use a dummy value]
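
This fallback chain can be sketched as follows. The helper name, its parameters, and the \c['Untitled'] dummy value are assumptions made for illustration, since the actual dummy value is not specified here:

```javascript
// Hypothetical helper. `toplevelArgs` holds the arguments given to the
// toplevel macro, `h1Titles` the level 1 header titles in document order.
function htmlDocumentTitle(toplevelArgs, h1Titles, dummy = 'Untitled') {
  if (toplevelArgs.title !== undefined) {
    // An explicit title argument always wins.
    return toplevelArgs.title;
  }
  if (h1Titles.length > 0) {
    // Otherwise fall back to the first level 1 header.
    return h1Titles[0];
  }
  return dummy;
}

console.log(htmlDocumentTitle({ title: 'My favorite title' }, ['Cirodown']));
console.log(htmlDocumentTitle({}, ['Cirodown']));
```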

\h[3][CSS]

Our CSS is located at \a[main.scss] and gets processed through \a[https://sass-lang.com/][Sass].

To generate the CSS during development after any changes to that file, you must run:

\C[[
npm run sass
]]

\h[3][Formal grammar]

TODO. Describe Cirodown's formal grammar, and classify it in the grammar hierarchy and parsing complexity.

\h[3][TODO]

\l[expose git revision, especially to multipage template]
\l[multipage template language that cannot run forever like Liquid. Maybe: \a[https://github.com/janl/mustache.js/]]
\l[cirodown : google analytics]
\l[custom header, footer and HEAD based on files like header.html, footer.html]
\l[\x[id-scope]]
\l[publish browser live quick test, Can be done by: hack convert_path_to_file to forward not just cirodown HTML output but also tracked HTML, CSS, JS, PNG, JPG, files, and then add an explicit include in the cirodown.json to force inclusion of out/index.js and out/main.min.css]
\l[list and code caption]
\l[image description and source]
\l[input indentation skip]
\l[correct ID generation algorithm to be Unicode robust]
\l[local downloads and single HTML page with image / script embeds]
\l[image management: upload to separate media directory]
\l[handle ID calculation from title if title has HTML elements, e.g. headers with code. Use that for the title as well. This has to be done with a context argument that makes implementing macros output just the text without tags]
\l[multiple newlines blow up (five or six), create test and decide best behaviour]
\l[autolinks http]
\l[paragraphs rendered as divs, can show anything inside e.g. lists, code blocks]
\l[create mechanism to link to the GitHub blob view of in-tree source files that are not added to GitHub pages, e.g. \c[\\file]]
\l[avoid infinite recursion with xref]
\l[update tests to test the AST output rather than HTML. HTML too complex and variable]
\l[parse_error: add in document error message, currently only to stderr]
\l[LaTeX output]
\l[infinite xref header recursion: \a[https://github.com/asciidoctor/asciidoctor/issues/3543] Solution: forbid x in titles, pass down context from parent.]
\l[on the AST, put contents of headers inside headers]
\l[parallel build across files on \c[./cirodown .]]
\l[includes without double newline between them]
\l[link from first toplevel header of document to parents determined by includes]

\h[2][Related projects]

\l[\a[https://github.com/rstudio/bookdown], \a[https://bookdown.org/]. Very similar feature set to what we want!!! Transpiles to markdown, and then goes through Pandoc: \a[https://bookdown.org/yihui/bookdown/pandoc.html], thus will never run on the browser without huge translation layers. It does however have an obscene amount of output formats.]
\l[\a[https://gohugo.io/][Hugo]. Pretty good, similar feature set to ours. But it is Go based, so hard to run on the browser, and it adds ad-hoc features on top of markdown once again]

Less related but of interest:

\l[\a[http://www.uprtcl.io/]]
\l[\a[https://libretexts.org]]
